substitution of claims 15-39 are believed to put the present 
application in better form for review by the Examiner. 
The distribution function of speech is modeled as a 
GMM of time-correlated samples, leading to a distribution 
function for the speech spectral amplitude s(f) as shown in 
Equation 2, where δ(s) is a one-sided Dirac delta function. The 
first term on the right hand side (RHS) of Equation 2 represents 
a signal of zero power, thus capturing the possibility that no 
signal of interest is present. The components of the summation 
in the second term on the RHS of Equation 2 are the components 
of the GMM model for the speech distribution function. 

Equation 2 
This speech model has two sets of frequency-band-dependent 
parameters which are updated during processing: P_s(f) and 
q_s(f). The first is the a priori PSD of speech, given that a 
speech signal is present at the frequency and time of interest. 
The second is the a priori probability of a speech signal being 
present at that frequency and time. The speech distribution 
function also has a number of added parameters, 
{a_i} = {a_1, a_2, ... a_N} and {p_i^0} = {p_1^0, p_2^0, ... p_N^0}. 
The {a_i} are the weights of the N Gaussian components of 
the GMM, and the {p_i^0} are the powers of each component when the 
speech PSD is normalized to P_s(f) = 1. In practice, P_s(f) and 
{p_i^0} are combined into a parameter set denoted as {p_i(f)}, 
where p_i(f) = p_i^0 P_s(f). 
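The scaling p_i(f) = p_i^0 P_s(f) can be illustrated with a minimal sketch. All numeric values below are hypothetical and are not taken from this application. The example also assumes that the normalized powers satisfy sum_i a_i p_i^0 = 1, which is one reading of "normalized to P_s(f) = 1". The function names are illustrative only.

```python
def component_powers(p0, P_s):
    """Return p_i(f) = p_i^0 * P_s(f) as a list of per-band lists."""
    return [[p0_i * P_f for p0_i in p0] for P_f in P_s]


def mean_speech_power(a, p_band):
    """Mixture mean power in one band, given that speech is present."""
    return sum(a_i * p_i for a_i, p_i in zip(a, p_band))


# Hypothetical GMM5 parameters (assumed values, chosen so that
# sum(a) == 1 and sum(a_i * p0_i) == 1).
a = [0.40, 0.30, 0.15, 0.10, 0.05]    # component weights {a_i}
p0 = [0.05, 0.40, 1.20, 3.00, 7.60]   # normalized powers {p_i^0}
P_s = [2.0, 0.5, 0.1]                 # a priori speech PSD in three bands

p = component_powers(p0, P_s)
for P_f, p_band in zip(P_s, p):
    # Under the assumed normalization the mixture mean equals P_s(f).
    print(P_f, mean_speech_power(a, p_band))
```

Under that assumed normalization, the trained shape parameters {a_i} and {p_i^0} stay fixed while a single dynamically updated quantity, P_s(f), rescales every component in a band.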
While both P_s(f) and q_s(f) are dynamically updated 
during the processing, the {a_i} and {p_i^0} are determined from 
prior "training" to optimize processing results as averaged over 
a representative body of training data. This may typically be 
done by minimizing the mean-squared error (MSE) between noise-free 
signals and the results of processing noisy input signals 
formed by mixing those signals with varying types and levels 
of interfering noise. The present invention may typically use 
five GMM components (denoted GMM5). However, more or fewer than 
five components can be employed. In addition, the {a_i} may be 
further parameterized by the values of other key quantities, 
including but not limited to signal-to-noise ratio (SNR), which 
are adaptively and dynamically updated throughout the processing. 
This may typically be done by determining different GMM model 
parameter values (the {a_i} and {p_i^0}) versus SNR based on training 
for different input SNRs, and interpolating between these model 
parameter values based on the adaptively estimated input SNR 
during the processing. One prior training of a GMM5 leads to a 
model for the speech distribution as shown in Figure 1 for q_s = 
0.5. Also shown is the corresponding distribution function for 
a Gaussian speech model with q_s = 1. For presentation purposes, 
the vertical axis is actually the distribution function for 
speech spectral power, which is simply f(s^2/P_s), and the 
horizontal axis is (s^2/P_s)^(1/2). 
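The interpolation of trained parameter sets against the estimated input SNR can be sketched as follows. The text does not specify the interpolation rule, so linear interpolation with clamping at the ends of the trained range is an assumption here. The SNR grid and weight values are hypothetical, as is the function name.

```python
from bisect import bisect_right


def interpolate_params(snr_db, trained_snrs_db, trained_params):
    """Linearly interpolate parameter vectors between trained SNR points.

    trained_snrs_db must be sorted in ascending order. Inputs outside
    the trained range are clamped to the nearest trained point.
    """
    if snr_db <= trained_snrs_db[0]:
        return list(trained_params[0])
    if snr_db >= trained_snrs_db[-1]:
        return list(trained_params[-1])
    hi = bisect_right(trained_snrs_db, snr_db)  # first grid point above snr_db
    lo = hi - 1
    span = trained_snrs_db[hi] - trained_snrs_db[lo]
    t = (snr_db - trained_snrs_db[lo]) / span
    return [(1.0 - t) * x + t * y
            for x, y in zip(trained_params[lo], trained_params[hi])]


# Hypothetical trained GMM5 weights {a_i} at three input SNRs (dB).
snrs_db = [0.0, 10.0, 20.0]
a_table = [
    [0.50, 0.25, 0.15, 0.07, 0.03],
    [0.40, 0.30, 0.15, 0.10, 0.05],
    [0.30, 0.30, 0.20, 0.12, 0.08],
]

a_now = interpolate_params(5.0, snrs_db, a_table)
print(a_now)
```

A convenient property of linear interpolation is that it forms a convex combination of two weight vectors, so the interpolated {a_i} still sum to one whenever the trained vectors do.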

Noise PSD updating is mainly based on the following. 

Given a priori distribution functions for the noise and speech 
spectral amplitudes, and a new measurement of the noisy signal 
spectral amplitude, r(f), a determination is made as to a best 



The form of this estimator is depicted in Figures 4a 
and 4b. In these figures, the vertical axis is (<s^2|r>/P_N) and 
the horizontal axis is (r^2/P_N)^(1/2). GMM5 results are given for 
different SNRs and a nominal speech distribution function at q_s = 
0.5, and are compared with a Gaussian speech model at q_s = 1.0 
and with an extended Gaussian model at q_s = 0.5. GMM5 results 
are in solid lines and Gaussian models are shown as dashed lines. 

In a manner similar to the previous explanation, the 
speech spectral amplitude can also be estimated as follows. 



Equation 13 




Note that in the special case with only one GMM component in the 
speech distribution function, and also with q_s = 1, the above 
expression reduces to a conventional Wiener filter. 
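The reduction to a Wiener filter can be checked numerically with a small sketch. It is not a reproduction of Equation 13. It assumes that each GMM component and the noise are zero-mean complex Gaussian, that r is the complex noisy spectral coefficient, and that the quantity estimated is the posterior mean of the speech coefficient. The function name and all numeric values are hypothetical.

```python
import math


def mixture_mmse_estimate(r, q_s, a, p, P_N):
    """Posterior-mean estimate of speech from r = s + n.

    Assumed model: with probability 1 - q_s there is no speech (power 0);
    otherwise component i is chosen with weight a[i] and s ~ CN(0, p[i]).
    The noise is n ~ CN(0, P_N).
    """
    priors = [1.0 - q_s] + [q_s * a_i for a_i in a]
    powers = [0.0] + list(p)
    r2 = abs(r) ** 2

    # Log of prior times likelihood; under hypothesis i, r ~ CN(0, p_i + P_N).
    log_w = []
    for prior, p_i in zip(priors, powers):
        if prior <= 0.0:
            log_w.append(-math.inf)
        else:
            var = p_i + P_N
            log_w.append(math.log(prior) - math.log(var) - r2 / var)

    # Normalize in a numerically stable way.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)

    # Each hypothesis contributes its own Wiener gain p_i / (p_i + P_N).
    gain = sum(w_i * p_i / (p_i + P_N) for w_i, p_i in zip(w, powers)) / total
    return gain * r


r = 2.0 - 1.0j
P_N = 1.0

# One component and q_s = 1: the estimate equals the Wiener filter output.
single = mixture_mmse_estimate(r, 1.0, [1.0], [4.0], P_N)
wiener = 4.0 / (4.0 + P_N) * r
print(single, wiener)

# Same Gaussian model with q_s = 0.5: the gain is reduced.
extended = mixture_mmse_estimate(r, 0.5, [1.0], [4.0], P_N)
print(abs(extended), abs(wiener))
```

With q_s below 1, the zero-power hypothesis takes some posterior weight and pulls the gain below the Wiener value. This is the qualitative behavior one would expect of an "extended" Gaussian model under these assumptions.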

For a typical set of GMM parameters, at q_s = 0.5 
and for different SNRs, the form of this estimator is shown in 
Figures 5a and 5b, where it is also compared with a Wiener filter 
at q_s = 1.0, and with an extended Wiener filter based on a 
Gaussian speech model but with q_s = 0.5. In the figures, the 
vertical axis is <s|r>/(P_N)^(1/2), and the horizontal axis is 
(r^2/P_N)^(1/2). 

It is further noted that the availability of separate 
estimates for both the speech spectral amplitude <s|r> and the 



